This paper presents two descriptors that address existing problems
in medical imaging by providing more information to describe
the different textural structures of digital images. The proposed global
and local descriptors can provide a more accurate analysis of medical
features by using a hybrid concatenation approach. Several
mathematical models in the form of local and global descriptors have
been developed and used in the computation and analysis of medical
problems. The experimental results show that both local and
global features are useful in the detection and analysis of
biomedical features. The results also indicate that the global
descriptor outperforms earlier approaches and demonstrates
the high discriminating power and robustness of the combined features for
accurate classification of CT images.

1. Introduction

It can be difficult for an untrained observer to visualize and extract general information from
images or objects. The useful information extracted from those images or objects is referred to as
features. These objects can be represented in two or three dimensions, and the extracted features
are smaller in size than the original image. The size or dimensionality is reduced because
irrelevant material is eliminated from the original images when preprocessing algorithms are
applied. This is a very important stage in image processing, as it helps programmers focus on the
important features that improve the computational accuracy of the experimental results, which is
why feature extraction matters so much in image analysis. However, care must be taken at this stage
to ensure that important information is not removed. Several factors must be considered before
regions or parts of the image are removed. In computing, an algorithm or a model can be developed
to achieve favourable results in this process.

Local features differ from global features in that the former represent a subset of the latter.
In other words, features extracted from certain parts of an image or object are local, while
global features capture the entire object. During the classification process, several patches or
local features can be extracted from a global image, because global representations or descriptors
describe the general properties of the entire image. A good example is the multifractal model for
effective analysis and classification of medical images developed in [1-6]. The multifractal
descriptor gives a global description of an image, as it captures the characteristic properties of
the entire object, while local features can be obtained from local binary patterns (LBP). Both
descriptors can be used for medical image description and analysis. A combination of the features
from local and global descriptors is expected to yield powerful features for efficient
classification and analysis of images.
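To make the local/global combination concrete, the following is a minimal sketch, assuming the basic
3 x 3, 8-neighbour form of LBP and a plain list concatenation of the two feature vectors. The paper
does not state which LBP variant or concatenation order was used, so the function names
`lbp_histogram` and `concatenate_features`, the neighbour ordering and the `>=` threshold rule are
illustrative assumptions only.

```python
def lbp_histogram(image):
    """Return a 256-bin histogram of basic 8-neighbour LBP codes.

    `image` is a list of rows of gray levels. Border pixels are skipped.
    The neighbour ordering below is an assumption; any fixed order works.
    """
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    height, width = len(image), len(image[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            centre = image[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the centre.
                if image[y + dy][x + dx] >= centre:
                    code |= 1 << bit
            hist[code] += 1
    return hist


def concatenate_features(local_features, global_features):
    """Join local and global feature vectors into one combined vector."""
    return list(local_features) + list(global_features)


# Usage: a hypothetical 3 x 3 patch and a made-up 2-element global feature vector.
patch = [[9, 0, 0],
         [0, 5, 0],
         [0, 0, 0]]
combined = concatenate_features(lbp_histogram(patch), [1.7, 2.1])
```

The combined vector has 256 local entries followed by the global entries, and can be passed to any
classifier that accepts fixed-length feature vectors.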


1.1. Techniques in Feature Extraction 

Strong background knowledge in mathematics helps in manipulating image pixels and developing
feature techniques that can differentiate between two closely related objects. Feature extraction
techniques involve processing pixel intensities and developing classification models for further
image analysis. Digital images with discrete pixels can be processed and manipulated to obtain
discriminating features for efficient classification, detection, or identification of patterns of
interest. Programmers and analysts decide how the information extracted from an object is processed
by constructing models that measure the similarities and differences in image features. The
developed models transform the pixel intensities into meaningful information that exposes the
characteristic features for further processing. The choice of technique depends on the goal, on the
specification of the developed system, and on the methods used for gathering information during
this process.

1.2. Computation of Local Features 

Higuchi's method is an efficient way of calculating the fractal dimension of a curve and has found
several applications in the analysis of time series [7]. The method is particularly suitable for a
one-dimensional signal whose values at regular discrete intervals are available in the form
$x(i)$, $i = 1, 2, \ldots, N$. Several new data point series can be constructed using an interval
length $\varphi$ and a starting index $t$:

$$ S_t(\varphi) = \{ x(t),\ x(t + \varphi),\ x(t + 2\varphi),\ \ldots,\ x(t + p\varphi) \}, \quad t = 1, 2, \ldots, \varphi \tag{1} $$

where

$$ p = \left\lfloor \frac{N - t}{\varphi} \right\rfloor \tag{2} $$

The length of the series in (1) is calculated as a normalized sum of differences:

$$ L_t(\varphi) = \frac{N - 1}{p\,\varphi^{2}} \sum_{i=1}^{p} \left| x(t + i\varphi) - x(t + (i - 1)\varphi) \right| \tag{3} $$

The mean length for each interval length is obtained as

$$ L(\varphi) = \frac{1}{\varphi} \sum_{t=1}^{\varphi} L_t(\varphi) \tag{4} $$

As in the case of the box-counting dimension, the Higuchi dimension $D_h$ is computed from the
slope of a linear regression line on a log-log plot with $\log(\varphi)$ along the x-axis and
$\log(L(\varphi))$ along the y-axis. Since $L(\varphi)$ scales as $\varphi^{-D_h}$, $D_h$ is the
magnitude of this slope.

An M × M image $I(x, y)$ must be converted to one-dimensional data before the above method can be
applied. A common approach is to add the values along each column to obtain a one-dimensional
array of sums of pixel intensities:

$$ x(i) = \sum_{y=1}^{M} I(i, y), \quad i = 1, 2, \ldots, M \tag{5} $$
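The column projection and the Higuchi estimate can be sketched together as below. This is a minimal
illustration rather than the authors' implementation: the function names `column_sums` and
`higuchi_dimension`, the parameter `phi_max`, and the normalization factor (N - 1) / (p * phi^2) are
assumptions, the last one taken from the standard form of Higuchi's method because the text only
says that the sum is "normalized".

```python
import math


def column_sums(image):
    """Collapse a 2-D image (list of rows) into x(i), the sum of column i."""
    return [sum(column) for column in zip(*image)]


def higuchi_dimension(x, phi_max):
    """Estimate the Higuchi dimension of sequence x with interval lengths 1..phi_max."""
    n = len(x)
    log_phi, log_len = [], []
    for phi in range(1, phi_max + 1):
        lengths = []
        for t in range(1, phi + 1):          # 1-based starting index t
            p = (n - t) // phi               # number of steps that fit in the series
            if p < 1:
                continue
            total = sum(abs(x[t - 1 + i * phi] - x[t - 1 + (i - 1) * phi])
                        for i in range(1, p + 1))
            # Assumed normalization (standard Higuchi form): (N - 1) / (p * phi^2).
            lengths.append(total * (n - 1) / (p * phi * phi))
        if not lengths:
            continue
        mean_len = sum(lengths) / len(lengths)   # mean over usable starting indices
        if mean_len > 0:
            log_phi.append(math.log(phi))
            log_len.append(math.log(mean_len))
    if len(log_phi) < 2:
        raise ValueError("need at least two usable interval lengths")
    # Ordinary least-squares slope of log L(phi) against log phi.
    mean_x = sum(log_phi) / len(log_phi)
    mean_y = sum(log_len) / len(log_len)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_phi, log_len))
    sxx = sum((a - mean_x) ** 2 for a in log_phi)
    return -sxy / sxx   # L(phi) ~ phi^(-D_h), so D_h is minus the slope


# Usage: every row is a linear ramp, so the column profile is a straight line.
ramp_image = [[col for col in range(32)] for _ in range(4)]
ramp_dimension = higuchi_dimension(column_sums(ramp_image), phi_max=6)
print(round(ramp_dimension, 6))  # a straight-line profile gives a value of about 1
```

A straight-line profile is a useful sanity check because its curve length is inversely proportional
to the interval length, which gives a dimension of 1; rougher profiles give values closer to 2.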


1.3. Generalized Renyi Dimension 


Higuchi's dimension outlined above can be extended to a generalized family of dimensions called
Renyi dimensions. These dimensions use a probability measure function $p$. In the context of the
box-counting algorithm, $p_i$ represents the probability of finding a point of the fractal within
the box with index $i$. The Renyi dimensions $D_q$ are defined with respect to a non-negative
parameter $q$ as

$$ D_q = \frac{1}{q - 1} \lim_{\varepsilon \to 0} \frac{\log \sum_i p_i^{q}}{\log \varepsilon} \tag{6} $$

As a special case of the above, when $q$ becomes 0, we get the box-counting dimension. In a fractal
system, the measured object is assumed to have an internal structure with different spatial scales;
the number $N(\varepsilon)$ of features of a certain size $\varepsilon$ scales as
$N(\varepsilon) \sim \varepsilon^{-D_0}$, where $D_0$ is the fractal dimension [8, 9].
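As a rough numerical illustration of the Renyi dimension, the sketch below evaluates the defining
ratio at a single finite box size instead of taking the limit. The function name `renyi_dimension`
and its arguments are hypothetical, and the separate handling of q = 1 (where the general formula is
singular and the limiting form, the information dimension, is used) is an addition not discussed in
the text.

```python
import math


def renyi_dimension(probabilities, eps, q):
    """Finite-scale estimate of D_q from box probabilities at box size eps (0 < eps < 1)."""
    occupied = [p for p in probabilities if p > 0]   # empty boxes do not contribute
    if abs(q - 1.0) < 1e-12:
        # Limiting form for q -> 1 (information dimension).
        return sum(p * math.log(p) for p in occupied) / math.log(eps)
    return math.log(sum(p ** q for p in occupied)) / ((q - 1.0) * math.log(eps))


# Usage: a uniform measure on the unit square covered by an 8 x 8 grid of boxes.
uniform_square = [1.0 / 64] * 64
d0 = renyi_dimension(uniform_square, eps=1.0 / 8, q=0)   # box-counting dimension
d2 = renyi_dimension(uniform_square, eps=1.0 / 8, q=2)   # correlation dimension
```

For a uniform measure every D_q coincides (here with the value 2 of the filled square); for a
multifractal measure D_q decreases as q grows, which is what makes the family useful as a texture
descriptor. In practice the estimate would be repeated over several box sizes and fitted by
regression rather than read from one scale.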


1.4. Exact and Statistical Self-Similarity

The most important geometrical characteristic exhibited by fractals is self-similarity, which is
the property of invariance under certain scale transformations. A fractal on a plane can be viewed
as a bounded set S of two-dimensional points. The set S is self-similar if it is the union of N
non-overlapping subsets, each of which is congruent to a scaled version of S [12]. Two sets of
points are congruent if one set can be transformed into an exact copy of the other by a similarity
transformation consisting of scaling, rotations, translations and reflections.

All real-world examples modelled using random fractals have statistical self-similarity. Here, 
different parts of a fractal cannot be made exactly congruent to the whole set even after an arbitrary 
rotational transformation and displacement. In this context, statistical self-similarity refers to the 
fact that enlargements of small constituent segments of a fractal have the same statistical distribution 
as the whole set [13]. In other words, parts of a random fractal can be matched with the whole set 
only in a statistical sense. In a broader context, statistical self-similarity refers to the characteristic 
of having a nearly constant measurement (within an allowable threshold) of certain statistical 
parameters derived from sets at various scales. 

Several methods of multi-fractal analysis of medical images have been suggested and evaluated in
different ways [14-20]. The authors of [14] developed a two-pass algorithm for computing the
multi-fractal spectrum and used the calculated spectra for classification in a tissue image
database. In [5], the Hölder exponent for the power-law approximation of intensity measures in
pixel neighbourhoods was used to resolve local density variations in CT lung images. This paper
provides a processing pipeline for describing digital images in the detection and analysis of
regions of interest in medical features.


2. Methodology

In this section, we give an overview of the computational stages in the analysis of images. The
purpose of this section is to give an integrated view of the whole pipeline, showing the sequence
of processes that should be implemented. Figure 1 shows a diagram of this processing pipeline,
which consists of three layers: the input layer, the computational layer and the output layer.
Note that the output layer involves two types of α-histograms. The first α-histogram is obtained
directly from the α-image, where each α value is transformed into the range [0, 255] and
represented as a gray level. The second α-histogram is generated by discretizing the α-range
[α_min, α_max] into n subintervals and counting the number of pixels with α values in each
subinterval. To evaluate the discriminating capability of the combined features from the two
descriptors shown in Figure 1, different classification experiments were carried out. Three
emphysema classes are defined, with 50 images assigned to each class. Because of the nature of the
datasets used in this study, we employed two different image classifiers: the SVM and random
forests (RF) [21-24]. The RF is well suited to medical datasets since it is relatively robust to
outliers and noise, and it tends to give better accuracy when random features are used. The RF
randomly selects inputs or combinations of inputs to grow each tree. Combining trees grown using
random features can significantly improve the classification accuracy, and the generalization
error of the forest converges to a limit as the number of trees becomes large [21].
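The two α-histograms of the output layer can be sketched as follows. This is only an illustration
under stated assumptions: the α-image is taken to be a list of rows of floating-point α values, the
mapping to gray levels is assumed to be a linear min-max rescaling (the text does not specify the
transform), and the function names are hypothetical.

```python
def alpha_to_gray(alpha_image):
    """Linearly rescale alpha values to integer gray levels in [0, 255] (assumed mapping)."""
    values = [a for row in alpha_image for a in row]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0          # avoid division by zero for a constant image
    return [[round(255 * (a - lo) / span) for a in row] for row in alpha_image]


def gray_level_histogram(gray_image):
    """First type: 256-bin histogram of the gray-level alpha-image."""
    hist = [0] * 256
    for row in gray_image:
        for level in row:
            hist[level] += 1
    return hist


def binned_alpha_histogram(alpha_image, n_bins):
    """Second type: split [alpha_min, alpha_max] into n_bins subintervals and count pixels."""
    values = [a for row in alpha_image for a in row]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0
    hist = [0] * n_bins
    for a in values:
        index = min(int((a - lo) / width), n_bins - 1)   # alpha_max goes in the last bin
        hist[index] += 1
    return hist


# Usage with a made-up 2 x 2 alpha-image.
alpha_image = [[1.0, 1.5],
               [2.0, 3.0]]
first_histogram = gray_level_histogram(alpha_to_gray(alpha_image))
second_histogram = binned_alpha_histogram(alpha_image, n_bins=4)
```

The first histogram always has 256 bins, while the resolution of the second is controlled by the
number of subintervals n, so the two give feature vectors of different lengths for the same image.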



Figure 1: Main steps in the computation of multi-fractal spectrum of an input image. 


3. Results and Discussion

SVMs have demonstrated highly competitive performance in many real-world applications, such as
bioinformatics, face recognition and image processing. The results indicate that the global feature
gives good classification performance, particularly with the RF classifier, although the results
obtained from the SVM are also good for the global features. As can be seen in Figure 1 and
Table 1, the window size has a clear effect on the results for these data sets.

Table 1: Classification Results of Global Descriptor

Classification accuracy - Global Feature

Image Sizes    SVM       Random Forest
64 * 64        50.04%    73.17%
128 * 128      52.08%    81.52%
256 * 256      62.53%    92.45%
320 * 320      61.16%    94.38%
384 * 384      62.23%    96.13%
512 * 512      61.51%    98.23%


In other words, the data sets with the smallest image size gave the lowest performance with both
classifiers. With the RF classifier, the accuracy increases steadily with image size, from 73.17%
at 64*64 to 98.23% at 512*512; with the SVM, the accuracy rises from 50.04% at 64*64 to 62.53% at
256*256 and then stays at about 61-62% for the larger sizes. This indicates that, in general, the
larger the window size, the higher the classification accuracy. This is because images with larger
sizes capture more useful information and features, which increases the discriminating power of
the features used in the classification process. The original size of the image pattern does not
contain enough discriminative information, which can reduce the accuracy of the classification
system. Overall, the performance of the global descriptor with the RF classifier is very good, with
classification accuracy in the range 73.17-98.23%. The RF classifier performed better than the SVM
in all cases; the classification accuracy of the SVM classifier falls within the range
50.04-62.53%. The classification results of the local descriptor using different image sizes with
the two classifiers are presented in Table 2.

Table 2: Classification Results of Local Descriptor

Classification accuracy - Local Descriptor

Image Sizes    SVM       Random Forest
64 * 64        47.34%    56.78%
128 * 128      48.25%    47.67%
256 * 256      49.89%    46.11%


The performance of the local descriptor is generally lower than that of the global descriptor. In
addition, changing the image size does not have any significant impact on the classification
accuracy of the local descriptor, whereas for the global descriptor the effect of the image size of
the patches is very obvious. This is expected, because the data from the local descriptor do not
require further invariance examination, as this has been done during implementation. It has been
verified experimentally that generating global features is about 5 times faster than generating
local features for the same image size. For instance, for a 128*128 image, the run time for the
local image was around 4.5s, whereas it took just 0.9s to generate the global image of the same
size. Similarly, for a 512*512 image, the run time for the local image was around 46.74s, while the
global image can be calculated in just 8.7s. The timing tests were carried out for four different
image sizes, that is, 64*64, 128*128, 256*256 and 512*512, for comparison between local and global
image features. In general, the local images consume more computational time than the global
images; the details of the computational time for both feature descriptors are presented in
Table 3. The framework of this study provides answers to a few research questions in the field of
image classification. For instance, how do the features extracted from local and global descriptors
behave under the same scale changes? How does the descriptor compare with previous work using other
textural classification approaches? This research work has demonstrated the effectiveness of global
descriptors in the classification of medical images. The experimental analysis shows how powerful
this descriptor is and how it behaves under different scale changes.

This explains the importance of global features compared to local features in the detection of
medical problems, as the former behave well under different image sizes. The second stage of the
experiment compares the results obtained with the state-of-the-art method in a recent article [16].
The results in Table 1 compare favourably with the LBP results of [16]: the overall classification
accuracy of 98.23% obtained from the global features with the RF classifier is higher than the
accuracy of 95.2% achieved by [16] for the classification of emphysema patterns. This shows that
the features from the global descriptor have higher discriminative power than those of LBP.


Table 3: Computational Time Comparison Between Global Image and Local Image

Computational Time in seconds

Image size     Global image    Local image
64 * 64        0.2957s         1.4531s
128 * 128      0.9113s         4.5104s
256 * 256      2.3865s         11.15s
512 * 512      8.7017s         46.74s
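The "about 5 times faster" statement can be checked directly from the timings in Table 3. The short
sketch below computes the local-to-global run-time ratio for each image size; the variable and
function names are illustrative only.

```python
# Run times in seconds from Table 3: image side length -> (global, local).
timings = {
    64: (0.2957, 1.4531),
    128: (0.9113, 4.5104),
    256: (2.3865, 11.15),
    512: (8.7017, 46.74),
}


def speedup_ratios(table):
    """Return local time divided by global time for each image size."""
    return {size: local / global_ for size, (global_, local) in table.items()}


ratios = speedup_ratios(timings)
for size, ratio in ratios.items():
    print(f"{size} * {size}: local/global = {ratio:.2f}")
```

The ratios lie between roughly 4.7 and 5.4 across the four sizes, which is consistent with the
factor of about 5 quoted in the text.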


4. Conclusion 

In this study, global and local descriptors have been developed to extract the textural
characteristics of digital images, using machine learning tools to identify regions or sections of
medical patterns with severe problems. We have also demonstrated, using the global descriptor, that
the window size of the image has a great influence on the classification accuracy. We have analyzed
the effect of different image sizes using two different descriptors. We have shown that the
proposed descriptors can perform well even with big data sets, especially the global descriptor,
which outperformed the local descriptor in all cases. This is a notable achievement, as the
classification of the large data sets involved in this experiment may pose challenges that affect
the performance of the classification model. The combined feature that uses the global descriptor
is better than the local descriptor in terms of computational time and overall efficiency of data
analysis.